video/tts audio clip by Peggy0422 · Pull Request #668 · onvif/specs

Peggy0422 · 2025-11-11T14:10:28Z

To support audio products with a TTS (text-to-speech) function, several changes are needed:

TTSCapabilities (optional): Add the complex type TTSCapabilities to the existing complex type "AudioClipCapabilities" as an optional element. It indicates whether the device supports TTS and, if so, the detailed configuration.
Parameters:
MaxContentLength: the maximum length of the content in a text file that the device can convert into an audio clip.
TTSLanguage: the languages the device supports, from which the client chooses one to perform TTS.
TTSVoiceType: the voice types the device supports when it plays an audio clip converted from text.
Add the "AddTTSAudioClip" and "AddTTSAudioClipResponse" elements: Send a text and its configuration to a device that supports TTS, so that the device can convert it into an audio clip and play it according to Configuration and TTSConfiguration.
Parameters:
Token (optional): token for the audio clip.
Configuration: the audio clip configuration to add; see Configuration for AddAudioClip.
TTSConfiguration: the configuration of the TTS audio clip to add. It specifies the audio content, language and voice type used when the device plays this audio clip.
Response:
Token: unique token of the TTS audio clip to be uploaded.
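
To make the request and response described above concrete, here is a rough sketch of what the message bodies might look like on the wire. It is not taken from the PR's WSDL. The element nesting, the `tr2` namespace binding for the child elements, the token value and the `Language` / `VoiceType` values are all assumptions for illustration; only the element names come from the description above.

```xml
<!-- Hypothetical AddTTSAudioClip request body (sketch, not from the PR) -->
<tr2:AddTTSAudioClip xmlns:tr2="http://www.onvif.org/ver20/media/wsdl">
  <!-- Token is optional; the value is made up -->
  <tr2:Token>tts_clip_1</tr2:Token>
  <tr2:Configuration>
    <!-- same content as Configuration in AddAudioClip; omitted here -->
  </tr2:Configuration>
  <tr2:TTSConfiguration>
    <tr2:Content>Attention: this area is under video surveillance.</tr2:Content>
    <!-- assumed values; the allowed values are defined by TTSLanguage and TTSVoiceType -->
    <tr2:Language>en-US</tr2:Language>
    <tr2:VoiceType>Female</tr2:VoiceType>
  </tr2:TTSConfiguration>
</tr2:AddTTSAudioClip>
```

```xml
<!-- Hypothetical response body: the device returns the token of the new clip -->
<tr2:AddTTSAudioClipResponse xmlns:tr2="http://www.onvif.org/ver20/media/wsdl">
  <tr2:Token>tts_clip_1</tr2:Token>
</tr2:AddTTSAudioClipResponse>
```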

media2.wsdl

Added the AddTTSAudioClip request and AddTTSAudioClip response for sending a text and its TTS configuration to the device.
Added the complex type "TTS Audio" for TTSConfiguration to support the TTS function. It includes the parameters Content, Language and VoiceType.
Updated AudioClipCapabilities with TTSCapabilities, and added the complex type TTSCapabilities to indicate that the device supports the TTS function and its corresponding configuration.
The complex type TTSCapabilities includes MaxContentLength, TTSLanguage and TTSVoiceType.
Added the simple types TTSLanguage and TTSVoiceType.
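
As a reading aid, the types listed above could be declared roughly as follows. This is a sketch under assumptions, not the schema from the PR: the base types, cardinalities and enumeration values are hypothetical, and the type name `TTSAudio` follows the `tr2:TTSAudio` reference that appears in the reviewed diff.

```xml
<!-- Sketch only: base types, cardinalities and enumeration values are assumptions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tr2="http://www.onvif.org/ver20/media/wsdl"
           targetNamespace="http://www.onvif.org/ver20/media/wsdl"
           elementFormDefault="qualified">

  <xs:simpleType name="TTSLanguage">
    <xs:restriction base="xs:string">
      <xs:enumeration value="en-US"/> <!-- hypothetical value -->
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="TTSVoiceType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Female"/> <!-- hypothetical value -->
      <xs:enumeration value="Male"/>   <!-- hypothetical value -->
    </xs:restriction>
  </xs:simpleType>

  <!-- Advertised by the device inside AudioClipCapabilities -->
  <xs:complexType name="TTSCapabilities">
    <xs:sequence>
      <xs:element name="MaxContentLength" type="xs:int"/>
      <xs:element name="TTSLanguage" type="tr2:TTSLanguage" maxOccurs="unbounded"/>
      <xs:element name="TTSVoiceType" type="tr2:TTSVoiceType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Sent by the client as TTSConfiguration in AddTTSAudioClip -->
  <xs:complexType name="TTSAudio">
    <xs:sequence>
      <xs:element name="Content" type="xs:string"/>
      <xs:element name="Language" type="tr2:TTSLanguage"/>
      <xs:element name="VoiceType" type="tr2:TTSVoiceType"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```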

media2.xml and documentation

Added detailed descriptions of the AddTTSAudioClip operation, explaining its purpose, parameters and response.
Updated the audio clip capabilities with TTSCapabilities.
ONVIF-Media2-Service-Spec-TTS update.docx

1. Added AddTTSAudioClip request and AddTTSAudioClip response for sending a text and its TTS configuration to the device (1621-1652) (2036-2041) (2418-2422) (2935-2943).
2. Added complex types "TTS Audio" (1465-1485) for TTSConfiguration to support TTS function. It includes parameters Content, Language, VoiceType.
3. Updated AudioClipCapabilities with TTSCapabilities (177-181), and added complex types for TTSCapabilities (201-220) to indicate the device supports TTS function and its corresponding configuration. Complex types TTSCapabilities includes MaxContentLength, TTSLanguage and TTSVoiceType.
4. Added simpleType TTSLanguage (220-231) and TTSVoiceType (232-238).

1. Added detailed descriptions for AddTTSAudioClip operations, explaining their purpose, parameters, and responses. (2359-2416)
2. Updated audio clip Capabilities with TTSCapabilities. (2698-2700)

update code line information for TTS function

correct some editorial errors

venki5685 · 2025-11-12T00:41:59Z

wsdl/ver20/media/wsdl/media.wsdl

+								<xs:documentation>Audio clip configuration to add.</xs:documentation>
+							</xs:annotation>
+					    </xs:element>
+                        <xs:element name="TTSConfiguration" type="tr2:TTSAudio">


Is the TTSConfiguration for an audio clip returned in the GetAudioClips API response? If not, how can a client query the TTSConfiguration for a given audio clip?

Peggy0422 · 2025-11-12

No, the GetAudioClips API response does not return a TTSConfiguration for an audio clip. The TTS configuration is only used by the device to convert a text into an audio clip, which is then stored on the device just like other audio clips. So far, there is no use case for querying TTSConfiguration through the GetAudioClips API response. To distinguish TTS audio clips from pre-recorded audio clips, a client could use the "name" element.

ocampana-videotec · 2025-11-12T06:12:01Z

doc/Media2.xml

+          <varlistentry>
+            <term>faults</term>
+            <listitem>
+              <para role="param">env:Receiver - ter:Action - ter:MaxAudioClipLimit</para>


I propose renaming ter:MaxAudioClipLimit to ter:MaxAudioClip, for uniformity with similar errors for other functions.

venki5685 · 2025-11-13

The MaxAudioClipLimit parameter was added as part of the Audio Clip Management feature, and the technical specification for this feature was released in ONVIF V25.06. Changing the parameter name now could cause backward compatibility issues.
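
For readers unfamiliar with how the fault triple `env:Receiver - ter:Action - ter:MaxAudioClipLimit` maps onto a message, the following sketch shows a SOAP 1.2 fault carrying those codes as nested subcodes. The reason text is an assumption; the nesting follows the standard SOAP 1.2 fault layout, and `ter` is bound to the ONVIF error namespace.

```xml
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
              xmlns:ter="http://www.onvif.org/ver10/error">
  <env:Body>
    <env:Fault>
      <env:Code>
        <env:Value>env:Receiver</env:Value>
        <env:Subcode>
          <env:Value>ter:Action</env:Value>
          <env:Subcode>
            <!-- the subcode whose name is under discussion -->
            <env:Value>ter:MaxAudioClipLimit</env:Value>
          </env:Subcode>
        </env:Subcode>
      </env:Code>
      <env:Reason>
        <!-- reason wording is assumed, not taken from the spec -->
        <env:Text xml:lang="en">Maximum number of audio clips reached</env:Text>
      </env:Reason>
    </env:Fault>
  </env:Body>
</env:Envelope>
```

Because clients match on the subcode value, renaming it would change what deployed clients see, which is the compatibility concern raised above.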

ocampana-videotec · 2025-11-12T06:17:56Z

wsdl/ver20/media/wsdl/media.wsdl

 			</xs:complexType>
 			<!--===============================-->
+            <!--=============TTS Capability=================-->
+            <xs:complexType name="TTSCapabilities">


Should we also expose the maximum number of clips? Since the device can return ter:MaxAudioClip, the limit should be available as a capability.

Peggy0422 · 2025-11-13

A TTS audio clip is an ordinary audio clip. AudioClipCapabilities already has the attribute "MaxAudioClipLimit", which also covers TTS audio clips.
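
A capabilities instance along these lines would then advertise both the shared clip limit and the TTS details. This is only an illustration: the use of the type name as the element name, the child layout and all values are assumptions; the source only states that MaxAudioClipLimit is an attribute of AudioClipCapabilities and that TTSCapabilities is added to it.

```xml
<!-- Hypothetical capabilities instance; element layout and values are assumed -->
<tr2:AudioClipCapabilities xmlns:tr2="http://www.onvif.org/ver20/media/wsdl"
                           MaxAudioClipLimit="16">
  <!-- the limit above applies to uploaded and TTS-generated clips alike -->
  <tr2:TTSCapabilities>
    <tr2:MaxContentLength>500</tr2:MaxContentLength>
    <tr2:TTSLanguage>en-US</tr2:TTSLanguage>
    <tr2:TTSVoiceType>Female</tr2:TTSVoiceType>
  </tr2:TTSCapabilities>
</tr2:AudioClipCapabilities>
```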

Updated the description of the AddTTSAudioClip operation to clarify the parameters and response. Updated the description of TTSCapabilities.

ocampana-videotec · 2025-12-04T12:31:39Z

@Peggy0422 I do not understand the relationship between this PR, #692 and #694. Which is the right one?

sujithhanwha · 2025-12-04T12:34:03Z

Closing this PR since a new PR is already open for the same feature.

Peggy0422 added 4 commits November 10, 2025 10:50

Update Media2.xml

d2607c7

Update media.wsdl

043366e

Update media.wsdl

43f83bf

venki5685 reviewed Nov 12, 2025

ocampana-videotec reviewed Nov 12, 2025

ocampana-videotec added the IPR needed, 26.06 and WG_enh labels Nov 12, 2025

Merge branch 'onvif:video/TTS-audio-clip' into video/TTS-audio-clip

46b11bb

Peggy0422 marked this pull request as ready for review December 1, 2025 06:18

Update Media2.xml

ea5b5dd

ocampana-videotec removed the IPR needed label Dec 4, 2025

sujithhanwha closed this Dec 4, 2025

sujithhanwha mentioned this pull request Dec 4, 2025

For ONVIF TTS audio proposal, to support device with TTS function #694

Open

video/tts audio clip #668
Peggy0422 wants to merge 6 commits into onvif:development from Peggy0422:video/TTS-audio-clip

Peggy0422 commented Nov 11, 2025
venki5685 Nov 12, 2025
Peggy0422 Nov 12, 2025
ocampana-videotec Nov 12, 2025
venki5685 Nov 13, 2025
ocampana-videotec Nov 12, 2025
Peggy0422 Nov 13, 2025
ocampana-videotec commented Dec 4, 2025
sujithhanwha commented Dec 4, 2025